
51 research outputs found

Randomized Maximum Entropy Language Models

Author: Asela Gunawardana
Puyang Xu
Sanjeev Khudanpur

Abstract: We address the memory problem of maximum entropy language models (MELM) with very large feature sets. Randomized techniques are employed to remove all large, exact data structures in MELM implementations. To avoid the dictionary structure that maps each feature to its corresponding weight, the feature hashing trick [1] [2] can be used. We also replace the explicit storage of features with a Bloom filter. We show with extensive experiments that false positive errors of Bloom filters and random hash collisions do not degrade model performance. Both perplexity and WER improvements are demonstrated by building MELM that would otherwise be prohibitively large to estimate or store.
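The two randomized structures named in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class names `BloomFilter` and `HashedWeights`, the `score` helper, the SHA-256 based hash, and the table sizes are all assumptions chosen for illustration.

```python
import hashlib


def _hash(item: str, seed: str, size: int) -> int:
    """Map a string to a bucket in [0, size) using a seeded SHA-256 digest."""
    digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "little") % size


class BloomFilter:
    """Approximate set membership: no false negatives, rare false positives."""

    def __init__(self, num_bits: int, num_hashes: int) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: str) -> list[int]:
        return [_hash(item, f"bloom{k}", self.num_bits) for k in range(self.num_hashes)]

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item))


class HashedWeights:
    """Feature hashing: weights live in a fixed-size array, with no dictionary."""

    def __init__(self, num_buckets: int) -> None:
        self.num_buckets = num_buckets
        self.weights = [0.0] * num_buckets

    def _index(self, feature: str) -> int:
        return _hash(feature, "weights", self.num_buckets)

    def update(self, feature: str, delta: float) -> None:
        self.weights[self._index(feature)] += delta

    def get(self, feature: str) -> float:
        return self.weights[self._index(feature)]


def score(features: list[str], known: BloomFilter, weights: HashedWeights) -> float:
    """Sum the hashed weights of features that the Bloom filter reports as present."""
    return sum(weights.get(f) for f in features if f in known)


# Hypothetical usage with n-gram style feature strings.
known = BloomFilter(num_bits=1 << 16, num_hashes=3)
weights = HashedWeights(num_buckets=1 << 16)
known.add("the cat")
weights.update("the cat", 1.5)
print(score(["the cat", "cat sat"], known, weights))
```

In this sketch a Bloom filter false positive or a hash collision can make an unseen feature contribute a nonzero weight; the abstract reports that such errors did not degrade model performance in the authors' experiments.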

CiteSeerX

Convolutional neural network based triangular CRF for joint intent detection and slot filling

Author: Puyang Xu
Ruhi Sarikaya
Publication date: 01/01/2013

Abstract: We describe a joint model for intent detection and slot filling based on convolutional neural networks (CNN). The proposed architecture can be perceived as a neural network (NN) version of the triangular CRF model (TriCRF), in which the intent label and the slot sequence are modeled jointly and their dependencies are exploited. Our slot filling component is a globally normalized CRF-style model, as opposed to the left-to-right models in recent NN-based slot taggers. Its features are automatically extracted through CNN layers and shared by the intent model. We show that our slot model component generates state-of-the-art results, outperforming CRF significantly. Our joint model outperforms the standard TriCRF by 1% absolute for both intent and slot. On a number of other domains, our joint model achieves 0.7-1% and 0.9-2.1% absolute gains over the independent modeling approach for intent and slot, respectively.

CiteSeerX

Energy distribution and economic growth: an empirical test for China

Author: Elliott Robert J.r.
Sun Puyang
Xu Qiqin
Publication venue: 'Elsevier BV'
Publication date: 01/03/2015

Crossref

University of Birmingham Research Portal

Deep Contextual Language Understanding in Spoken Dialogue Systems

Author: Chunxi Liu
Puyang Xu
Ruhi Sarikaya
Publication date: 24/04/2020

Abstract: We describe a unified multi-turn multi-task spoken language understanding (SLU) solution capable of handling multiple context-sensitive classification (intent determination) and sequence labeling (slot filling) tasks simultaneously. The proposed architecture is based on recurrent convolutional neural networks (RCNN) with shared feature layers and globally normalized sequence modeling components. The temporal dependencies within and across different tasks are encoded succinctly as recurrent connections. The dialog system responses beyond the SLU component are also exploited as effective external features. We show with extensive experiments on a number of datasets that the proposed joint learning framework generates state-of-the-art results for both classification and tagging, and that the contextual modeling based on recurrent and external features significantly improves the context sensitivity of SLU models.

CiteSeerX

Open World Classification with Adaptive Negative Samples

Author: Bai Ke
Carin Lawrence
Henao Ricardo
Lee Sungjin
Li Jiwei
Park Sunghyun
Wang Guoyin
Xu Puyang
Publication date: 09/03/2023

Open world classification is a task in natural language processing with key practical relevance and impact. Since the open or unknown category data only manifests in the inference phase, finding a model with a suitable decision boundary that accommodates both the identification of known classes and the discrimination of the open category is challenging. The performance of existing models is limited by the lack of effective open category data during the training stage or the lack of a good mechanism to learn appropriate decision boundaries. We propose an approach based on adaptive negative samples (ANS), designed to generate effective synthetic open category samples in the training stage without requiring any prior knowledge or external datasets. Empirically, we find a significant advantage in using auxiliary one-versus-rest binary classifiers, which effectively utilize the generated negative samples and avoid the complex threshold-seeking stage in previous works. Extensive experiments on three benchmark datasets show that ANS achieves significant improvements over state-of-the-art methods. Comment: Accepted by EMNLP 2021 (Main Track, Long Paper)
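One way to read the one-versus-rest idea in this abstract is sketched below in Python. It is an assumption-laden illustration, not the paper's method: the function name `predict_open_world`, the label strings, and the rule "reject when every binary classifier's logit is negative" are hypothetical choices, and the generation of negative samples is not shown.

```python
def predict_open_world(logits: list[float], labels: list[str], open_label: str = "open") -> str:
    """Pick a known class from one-versus-rest logits, or fall back to the open category.

    Each logit is assumed to come from a separate binary classifier for one known class.
    A logit >= 0 corresponds to a sigmoid probability >= 0.5, so no tuned threshold is
    needed in this sketch: the input is rejected only when every classifier says "rest".
    """
    if len(logits) != len(labels):
        raise ValueError("expected one logit per known label")
    best = max(range(len(logits)), key=lambda i: logits[i])
    return labels[best] if logits[best] >= 0.0 else open_label


# Hypothetical intent labels and classifier outputs.
labels = ["book_flight", "play_music", "set_alarm"]
print(predict_open_world([2.0, -1.0, -3.0], labels))
print(predict_open_world([-2.0, -1.0, -3.0], labels))
```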

arXiv.org e-Print Archive

Continual Segment: Towards a Single, Unified and Accessible Continual Segmentation Model of 143 Whole-body Organs in CT Scans

Author: Gao Mingchen
Ge Jia
Guo Dazhou
Ji Zhanghexuan
Jin Dakai
Lu Le
Wang Puyang
Wang Qifeng
Xu Minfeng
Yan Ke
Ye Xianghua
Zhou Jingren
Publication date: 13/03/2023

Deep learning empowers the mainstream medical image segmentation methods. Nevertheless, current deep segmentation approaches cannot efficiently and effectively adapt and update trained models when new incremental segmentation classes (with or without new training datasets) need to be added. In a real clinical environment, it is often preferable that segmentation models can be dynamically extended to segment new organs/tumors without (re-)access to previous training datasets, due to obstacles of patient privacy and data storage. This process can be viewed as a continual semantic segmentation (CSS) problem, which is understudied for multi-organ segmentation. In this work, we propose a new architectural CSS learning framework to learn a single deep segmentation model for segmenting a total of 143 whole-body organs. Using the encoder/decoder network structure, we demonstrate that a continually trained, then frozen, encoder coupled with incrementally added decoders can extract and preserve sufficiently representative image features for new classes to be subsequently and validly segmented. To maintain the complexity of a single network model, we trim each decoder progressively using neural architecture search and teacher-student based knowledge distillation. To handle both healthy and pathological organs appearing in different datasets, a novel anomaly-aware and confidence learning module is proposed to merge the overlapping organ predictions originating from different decoders. Trained and validated on 3D CT scans of 2500+ patients from four datasets, our single network can segment a total of 143 whole-body organs with very high accuracy, closely reaching the upper bound performance level obtained by training four separate segmentation models (i.e., one model per dataset/task).

arXiv.org e-Print Archive

Serological analysis of allergic components of house dust mite provides more insight in epidemiological characteristics and clinical symptom development in North China

Author: Jiaofeng Wang
Jin-Lyu Sun
Lan Zhao
Lishan Zhang
Mingzhi Zhu
Puyang Xu
Shandong Wu
Xukai Yang
Yi Liu
Yifei Wang
Yinshi Guo
Zhongshan Gao
Zhoujie Wu
Publication venue: 'Frontiers Media SA'
Publication date: 01/04/2023

Background: House dust mite (HDM) is the most common airborne source causing complex allergy symptoms. There are geographic differences in the allergen molecule sensitization profiles. Serological testing with allergen components may provide more clues for diagnosis and clinical management. Objective: This study aims to investigate the sensitization profile of eight HDM allergen components in a large number of patients enrolled in the clinic and to analyze the relation of gender, age, and clinical symptoms in North China. Methods: The 548 serum samples of HDM-allergic patients (ImmunoCAP® d1 or d2 IgE ≥0.35) were collected in Beijing City and divided into four different age groups and three allergic symptoms. The specific IgE of HDM allergenic components, Der p 1/Der f 1, Der p 2/Der f 2, Der p 7, Der p 10, Der p 21, and Der p 23, was measured using the micro-arrayed allergen test kit developed by Hangzhou Zheda Dixun Biological Gene Engineering Co., Ltd. The new system was validated by comparison with single-component Der p 1, Der p 2, and Der p 23 tests by ImmunoCAP in 39 sera. The epidemiology of these IgE profiles and their relation to age and clinical phenotypes were analyzed. Results: A greater proportion of male patients was in the younger age groups, while more female patients were in the adult groups. Both the sIgE levels and the positive rates (approximately 60%) against Der p 1/Der f 1 and Der p 2/Der f 2 were higher than for the Der p 7, Der p 10, and Der p 21 components (below 25%). The Der f 1 and Der p 2 positive rates were higher in 2–12-year-old children. The Der p 2 and Der f 2 IgE levels and positive rates were higher in the allergic rhinitis group. The positive rates of Der p 10 increased significantly with age. Der p 21 is relevant to allergic dermatitis symptoms, while Der p 23 contributes to asthma development. Conclusion: HDM groups 1 and 2 were the major sensitizing allergens, with group 2 being the most important component relevant to respiratory symptoms in North China. The Der p 10 sensitization tends to increase with age. Der p 21 and Der p 23 might be associated with the development of allergic skin disease and asthma, respectively. Multiple allergen sensitizations increased the risk of allergic asthma.

Directory of Open Access Journals